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Abstract 

This paper presents a Cramer-Rao lower bound (CRLB) on the variance of unbiased estimates 
of factor matrices in Canonical Polyadic (CP) or CANDECOMP/PARAFAC (CP) decompositions 
of a tensor from noisy observations, (i.e., the tensor plus a random Gaussian i.i.d. tensor). A novel 
expression is derived for a bound on the mean square angular error of factors along a selected 
dimension of a tensor of an arbitrary dimension. The expression needs less operations for computing 
the bound, 0(NR 6 ), than the best existing state-of-the art algorithm, 0(N 3 R 6 ) operations, where 
N and R are the tensor order and the tensor rank. Insightful expressions are derived for tensors of 
rank 1 and rank 2 of arbitrary dimension and for tensors of arbitrary dimension and rank, where two 
factor matrices have orthogonal columns. 

The results can be used as a gauge of performance of different approximate CP decomposition 
algorithms, prediction of their accuracy, and for checking stability of a given decomposition of a 
tensor (condition whether the CRLB is finite or not). A novel expression is derived for a Hessian 
matrix needed in popular damped Gauss-Newton method for solving the CP decomposition of tensors 
with missing elements. Beside computing the CRLB for these tensors the expression may serve for 
design of damped Gauss-Newton algorithm for the decomposition. 
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I. Introduction 

Order-3 and higher-order data arrays need to be analyzed in diverse research areas such as 
chemistry, astronomy, and psychology HJ — 1[3]| . The analyses can be done through finding multi-linear 
dependencies among elements within the arrays. The most popular model is Parallel factor analysis 
(PARAFAC), also called Canonical decomposition (CANDECOMP) or CP, which is an extension of a 
low rank decomposition of matrices to higher-way arrays, usually called tensors. In signal processing, 
the tensor decompositions have become popular for their usefulness in blind source separation H. 

Note that a best-fitting CP decomposition may not exist for some tensors. In that case, trying to find 
a best-fitting CP decomposition results in diverging factors Q, (6j. This paper is focussed on studying 
CP decompositions of a noisy observations of tensors, which admit an exact CP decomposition. The 
decomposition of the noiseless tensor is taken as a ground truth for computing errors. 

An important issue is the essential uniqueness of CP decomposition as it entails identifiability of 
the model (the factor matrices) from the tensor. The adjective "essential" means that the model is 
unique up to a scale and permutation ambiguity, which is inherent to the problem. Initial works in 
the field can be traced back in 70's in works of Harshman 0, (3). A popular sufficient condition 
for the uniqueness was derived by Kruskal in |9). Recently, the problem has been addressed again, 
namely by Stegeman, Ten Berge, De Lathauwer, Jiang, Sidiropoulos et al.; see lTT0l - |[22l . 

This paper is focussed on stability of the CP decomposition rather than on the uniqueness. By 
stability we mean existence of a finite Cramer-Rao bound in a stochastic set-up, where tensor elements 
are corrupted by additive Gaussian-distributed noise. Relation of this kind of stability to a deterministic 
stability and to the uniqueness was studied in |[23l . It is not true, in general, that stability of a 
solution of a nonlinear problem implies uniqueness of the solution. For example, there might always 
be a permutation or sign ambiguity. It is yet an open theoretical question if stability of the CP 
tensor decomposition problem implies its essential uniqueness. Regardless of the missing link to 
identifiability, the stability is an interesting concept which is worth to be studied, because different 
kind of noise is very common. 

In general, in order to evaluate performance of a tensor decomposition, the approximation error 
between the data tensor and its approximate is commonly used. Unfortunately, such measure does 
not imply quality of the estimated components. In practice, in some difficult scenarios such as 
decomposition of tensor with linear dependency among components of factor matrices, or large 
difference in magnitude between components |[24l . |[25l . most CP algorithms explained the data 
tensor at almost identical fit, but only few algorithms can accurately retrieve the hidden components 
from the tensor |[26l . ll24l . In order to verify theoretically the quality of the estimated components and 
evaluate robustness of an algorithm, an appropriate measure is an essential prerequisite. The squared 
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angular error between the estimated component and its original one is such a measure E71 . ESI . 
Working with angular errors is practical, because the scaling ambiguity does not play a role. Only 
the permutation ambiguity has to be solved in practical examples, because order of the factor can be 
quite arbitrary. 

Cramer-Rao lower bound for CP decomposition was first studied in ||29l , and later, a more compact 
asymptotic expression was derived in |[30l for tensors of order 3 appearing in wireless communications. 
A non-asymptotic (exact) CRLB-induced bound (CRIB) on squared angular deviation of columns of 
the factor matrices with respect to their nominal values has been studied in ETll . Similar results for 
symmetric tensors are derived in OTTl . Nevertheless, the study is limited to the case of three-way 
tensors. In the general case, CRIB can be, indeed, calculated through the approximate Hessian which 
is often huge, and is impractical to directly invert. Note that such task normally costs 0(R 3 T 3 ) where 
T = J2 n In- Seeking a cheaper method for CRIB is a challenge to made it applicable. 

This paper presents new CRIB expressions for tensors of arbitrary dimension and rank, and 
specialized expressions for rank 1 and rank 2 tensors. The results rely on compact expressions for 
Hessian of the problem derived in |[26l . Alternative expressions for the Hessian exist in ll37l . Note, 
however, that unlike ll26l . this paper presents different expressions for inverse of the Hessian, which 
have lower computational complexity. In particular, complexity of inversion of the Hessian is reduced 
from 0(N 3 R (i ) operations to 0(NR 6 ), where N and R are the tensor order and the tensor rank, 
respectively. 

On basis of new discovered properties of the CRIB, we established connection between theoretical 
and practical results in CPD: 

• Stability of CPD for rank-1 and rank-2 tensors of arbitrary dimension. 

• The work may serve as theoretical support for a novel CP decomposition algorithm through tensor 
reshaping Il32l . which was designed to decompose high-dimensional and high-order tensors. In 
particular, it appears that higher-order orthogonally constrained CPD OBI . 041 . 051 . 061 can 
be decomposed efficiently through tensor unfolding. 

• Stability when factor matrices occur linear dependence problem and especially the rank-overlap 
problem (T), ll22l . ll34l . The problem is related to a variant of CPD for linear dependent loadings 
which was investigated in chemometric data and in flow injection analysis HI, 041 . A partial 
uniqueness condition of the related model is discussed in Il22l . 

• CP decomposition of tensors with missing entries, which is quite frequent in practice, is 
addressed. An approximate Hessian for this case is derived, which is the core for the damped 
Gauss-Newton algorithm for the decomposition. 

• A maximum tensor rank, given dimension of the tensor, which admits a stable decomposition is 
discussed. 
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The paper is organized as follows. Section II presents the main result, the Cramer-Rao induced bound 
on angular error of one factor vector in full generality. In Section III, this result is specialized for 
tensors of rank 1 and rank 2, and for the case when two factor matrices have mutually orthogonal 
columns. Section IV is devoted to a possible application of the bound: investigation of loss of accuracy 
of the tensor decomposition when the tensor is reshaped to a lower-dimensional form. Section V deals 
with the bound for tensors with missing entries, Section VI contains examples - CRIB computed for 
CP decomposition of a fluorescence tensor, stability of the tensor investigated by Brie et al, and a 
discussion of a maximum stable rank given the tensor dimension. Section VII concludes the paper. 

II. Presentation of the CRIB 

A. Cramer-Rao bound for CP decomposition 

Let y be an N— way tensor of dimension I\ x 1% X . . . X In. The tensor is said to be of rank R, 
if R is the smallest number of rank-one tensors which admit the decomposition of ^ of the form 

y = ^4 1) °a(. 2 )o...oaW (1) 

r=l 
(n) 

where o denotes the outer vector product, , r = 1, . . . , it, n = 1, . . . , N are vectors of the length I n 
called factors. The tensor in (OQ) can be characterized by N factor matrices A ra = [a^ ,a| , . . . , a^] 
of the size I n x R for n = 1, . . . , N. Sometimes (fl]) is referred to as a Kruskal form of a tensor ll43l . 

In practice, CP decomposition of a given rank (R) is used as an approximation of a given tensor, 
which can be a noisy observation "9 of the tensor y in CD). Owing to the symmetry of (1), we can 
focus on estimating the first factor matrix Ai, without any loss of generality, and we can assume 
that all other factor matrices have columns of unit norm. Then the "energy" of the parallel factors is 
determined by the squared Euclidean norm of columns of Ai. 

It is common to assume that the noise has a zero mean Gaussian distribution with variance a 2 , 
and is independently added to each element of the tensor in £[]). 

Let a vector parameter containing all parameters of our model be arranged as 

0= [(vecA 1 ) T ,...,(vecA 7V ) T ] T . (2) 
The maximum likelihood solution for consists in minimizing the least squares criterion 

Q{o) = \\y-y{e)\\ 2 F 0) 

where || • \\p stands for the Frobenius norm. 

We wish to compute the Cramer-Rao lower bound for estimating 6. In general, for this estimation 
problem, the CRLB is given as the inverse of the Fisher information matrix, which is equal to E71 

F(0) = ^J T (0)J(0) (4) 
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where 3(6) is the Jacobi matrix (matrix of the first-order derivatives) of Q(&) with respect to 0. In 
other words, the Fisher information matrix is proportional to the approximate Hessian matrix of the 
criterion, H(0) = 3 T (6)3(9). 

Let T nm denote the Hadamard (elementwise) product of matrices = A^ A&, k € {1, . . . , N} — 

{n,m}, 

Trim = ® Cfc , Cfe = A^Afe . (5) 

Theorem 1 |[26l : The Hessian H can be decomposed into low rank matrices under the form as 

H = G + Z K Z T (6) 

where K = [K nm ]^ m=1 contains submatrices K nm given by 

K nm = (1 - 5 nm )P R dvec (r nm ) (7) 

P# is the permutation matrix of dimension R 2 x R 2 defined in ll26l such that vec M = P# vec (M T ) 
for any R x R matrix M, and S nm is the Kronecker delta, and dvec(M) is a short-hand notation for 
diag(vec(M)), i.e. a diagonal matrix containing all elements of a matrix M on its main diagonal. 
Next, 

G = bdiag(r nn ^I 7 „)^ =1 (8) 

and 

Z = bdiag(I R ® A n )^ =1 (9) 

where <g> denotes the Kronecker product, Ij n is an identity matrix of the size I n x I n , and bdiag(-) is 
a block diagonal matrix with the given blocks on its diagonal. Note that the Hessian H in © is rank 
deficient because of the scale ambiguity of columns of factor matrices E51 . 11391 . It has dimension 

(i?En J n) X (^En 7 «) but itS rank is at mOSt R En J n ~ ( N ~ tfR- 

A regular (reduced) Hessian can be obtained from H by deleting (N — 1)R rows and corresponding 

(n) 

columns in H, because the estimation of one element in the vectors a r ', r = 1, . . . , R, n = 2, . . . , N 
can be skipped. The reduced Hessian may have the form 

H E = EHE T (10) 

where 

E = bdiag (I RIl , I R ® E 2 , . . . , I R ® Ejv) (11) 

and E n is an (I n — 1) x J n matrix of rank I n — 1. For example, one can put E n = [0g n _ 1 ) xl I/ n -i] 
for n = 2, . . . , N. With this definition of E n , H^; is a Hessian for estimating the first factor matrix 

(n) 

Ai and all other vectors a T - ', r = 1, . . . , R, n = 2, . . . , N without their first elements. In the sequel, 
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however, we use a different definition of E n . Note that each E n can be quite arbitrary, together 
facilitate a regular transformation of nuisance parameters, which does not influence CRLB of the 
parameter of interest. 

The CRLB for the first column of Ai, denoted simply as ai, is defined as a 2 times the left-upper 
submatrix of H^ 1 of the size I\ X I\, 

CRLB (ai ) = a 2 [H^] 1:/lil:/l . (12) 

Substituting Q in (TTOb gives 

li E = G E + Z E KZ% (13) 

where G E = EGE T and Z E = EZ. Inverse of H E can be written using a Woodbury matrix identity 
1551 as 

H" 1 = G^ 1 - G^ZeKQlnk + ZeG^ZeK^ZeG- 1 (14) 

provided that the involved inverses exist. 
Next, 

G E = bdiag(rii®Ii,r 2 2®(E2Ef),... ! r J vjv<8)(E iV E^)) (15) 
G^ 1 = bdiag((r 11 )- 1 ®I 1 ,r^ 1 ®(E 2 E^)- 1 ,...,r^ 1 7V ®(E^E^)- 1 ) . (16) 

Put 

* = ZeG^Ze (17) 
B = K(Itvr 2 + 'J/K) 1 (18) 

and let Bo be the upper-left R 2 x R 2 submatrix of B, symbolically Bo = Bi ; /?2 1:J? 2. Finally, let gu 
and gi >: be the upper-left element and the first row of rjQ , respectively. Then 

[Hg 1 ] = [Ge^I-Ju^-Ii + [G^^e] l:h ,1:R? B [G^ l Z E ] tl:R 2 

= gul h + (gi i: ^A 1 )B (gi, : (g)A 1 ) T . (19) 

The CRLB represents a lower bound on the error covariance matrix E[(ai — ai)(ai — ai) T ] for any 
unbiased estimator of ai. The bound is asymptotically tight in the case of Gaussian noise and least 
squares estimator, which is equivalent to maximum likelihood estimator, under the assumptions that 
the permutation ambiguity has been solved out (order of the estimated factors was selected to match 
the original factors) and scaling of the estimator is in accord with the selection of the matrix E. 
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B. Cramer- Rao -induced bound for angular error 

CRLB(ai) considered in the previous subsection is a matrix. In applications it is practical to 
characterize the error of the factor ai in the decomposition by a scalar quantity. In E8l it was 
proposed to characterize the error by an angle between the true and the estimated vector, and compute a 
Cramer-Rao-induced bound (CRIB) for the squared angle. The CRIB may serve a gauge of achievable 
accuracy of estimation/CP decomposition. Again, it is an asymptotically (in the sense of variance of 
the noise going to zero) tight bound on the angular error between an estimated and true factor. 

The angle a\ between the true factor ai and its estimate ai obtained through the CP decomposition 
is defined through its cosine 

ajai 

cos a\ = t. — ; . (20) 
ll a i|| ll a ill 

The Cramer-Rao induced bound for the squared angular error a\ [radians 2 ] will be denoted CRIB(ai) 
in the sequel. CRIB(ai) in decibels (dB) is then defined as -10 log 10 [CRIB(ai)] [dB]. 

Before computing CRIB(ai) we present another interpretation of this quantity. Let the estimate ai 
be decomposed into a sum of a scalar multiple of ai and a reminder, which is orthogonal to ai, 

ai = /3ai+ri (21) 

where /3 = a^ai/||ai|| 2 and ri = ai — /3ai. Then, the Distortion-to-Signal Ratio (DSR) of the 
estimate ai can be defined as 

DSR( *' ) = ?JjKF' (22) 

A straightforward computation gives 

DSR( ai ) = 1 ~ C |f 2 01 « a\ . (23) 

COS^ Ql 

The approximation in (1231 is valid for small af. We can see that CRIB(ai) serves not only as a bound 
on the mean squared angular estimation error, but also as a bound on the achievable Distortion-to- 
Signal Ratio. 



Theorem 2 E8l : Let CRLB(ai) be the Cramer-Rao bound on covariance matrix of unbiased estimators 
of ai. Then the Cramer-Rao-induced bound on the squared angular error between the true and 
estimated vector is 

, trTLf CRLB(ai)] 
CRIB ai = ai „ „» J (24) 
ll a i|| 

where 

11^ =I 7l - ai af/|| ai || 2 (25) 
is the projection operator to the orthogonal complement of ai and tr[.] denotes trace of a matrix. 
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Proof: A sketch of a proof can be found in ESI . It is based on analysis of a mean square angular 
error of a maximum likelihood estimator, which is known to be asymptotically tight (achieving 
the Cramer-Rao bound). Note that a conceptually more straightforward but longer proof would be 
obtained through the formula for CRLB on a transformed parameter, see e.g., Theorem 3.4 in Il42l . 
In particular, 

CRTB(ai) = G a (a 1 )CRLB(a 1 )G^(a 1 ) (26) 

where G a (ai ) is the Jacobi matrix of the mapping representing the angular error as a function of the 
estimate ai. 

Theorem 3: The CRIB(ai) can be written in the form 

2 

CRTB(ai) = -r^ {(/i - l) 5 n - tr [B ((gi>i, : ) ® Xi)] } (27) 
!l a i II 

where Bo is the submatrix of B in (TT8T ). Bo = Ri.r 2 ,i-.r 2 , 

/-i _ r~i ( n ) f~< ( n ) T 

n — {nj --,1 :,1 °> 

(n) (n) 

for n = 1,...,N, C\i and C. x denote the upper-right element and the first column of C n , 
respectively, and \I/ in the definition of B takes, for a special choice of matrices E n , the form 

* = bdiag(r n 1 0Ci,r 22 1 ®x 2 ,...,r^ 1 JV 0X JV ) . (29) 

Proof: Substituting (fT2l and ( fT9l into (1241 gives, after some simplifications, 

2 

CRIB(ai) = ^^tr [n^ (gnh, - (gi, : ® Ai) B (gi • g> Ai) 

{ (Ji - l) 5 n - tr [n^ (g 1>: ® Ai) B (gi ■ 8) Ai) T ] } 
{(/i - l) 5 n - tr [b ((gi>i, : ) ® (Afn^Ai))] } . (30) 



ll a ll 


2 






<T 2 




ll a l| 


2 






a 2 





This is (|27T ). because 



llaill 2 

Afn^Ai = Ci - -Lc^cJ^ = Xi . (31) 

Next, assume that E is defined as in (TTTb . but E n are arbitrary full rank matrices of the dimension 
(I n - 1) x I n . Then, combining (TTTb . ©, CED and dHJ) gives 

* = 7? E G E l Z E = bdiag (r n x ® Ci, T^ 1 ® X 2 , . . . , T^ Xjy) (32) 

where 

X n = AK(E n E^)~ 1 E n A n (33) 

for n = 2, . . . , N. Note that the expression E^(E n E^) _1 E n is an orthogonal projection operator to 
the columnspace of E^. If E n is chosen as the first (I n — 1) rows of 

ni.,=I 7n -i4 B) ai B)T /l|a? l) || 2 (34) 
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then E^(E n E^) _1 E n = Il\ n) and consequently X n = A^U^An = X n . ■ 

a i a i 

Note that the first row and the first column of X n are zero. 
Theorem 4: Assume that all elements of the matrices C n in © are nonzero. Then, the matrix Bo 
in Theorem 3 can be written in the form 

B = [-I R 2 + V(I R 2 + V)~ X ]Y (35) 

where 

V = W-Yfr^d) (36) 

N 

W = P i? ^dvec(r ln )S^ 1 (r m 1 ®X„)dvec(C 1 0C n ) (37) 

n=2 
N 

Y = P R ^dvec(r ln )S; 1 (r- n 1 ®X ri )Pi ? dvec(r ln ) (38) 

n=2 

S n = Ijp - (T^ 1 X n )dvec(r nn C n )P R , n = 2,...,N. (39) 

In (l37l) and (l39l) . "0" stands for the element-wise division. 
Proof: See Appendix B. 

Note that in place of inverting the matrix B of the size NR 2 x NR 2 , Theorem 4 reduces the 
complexity of the CRIB computation to N inversions of the matrices of the size R 2 x R 2 . The 
Theorem can be extended to computing the inverse of the whole Hessian in 0(NR 6 ) operations, see 

ma. 

Finally, note that the assumption that elements of C n must not be zero is not too restrictive. 
Basically, it means that no pair of columns in the factor matrices must be orthogonal. The Cramer- 
Rao bound does not exhibit any singularity in these cases, and is continuous function of elements 
of C„. If some element of C„ is closer to zero than say 10~ 5 , it is possible to increase its distance 
from zero to that value, and the resultant CRIB will differ from the true one only slightly. 
Theorem 5 (Properties of the CRIB) 

1) The CRIB in Theorems 3 and 4 depends on the factor matrices A n only through the products 
C - A r A 

2) The CRIB is inversely proportional to the signal-to-noise ratio (SNR) of the factor of the 
interest (i.e. ||ai || 2 /(a 2 Ii)) and independent of the SNR of the other factors, ||a r || 2 /(cr 2 I r ), 
r = 2,...,R. 

Proof: Property 1 follows directly from Theorem 3. Property 2 is proven in Appendix C. 
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III. Special cases 

A. Rank 1 tensors 

In this case, the matrix Xi is zero, and 

2 2 

CRIB(ai) = -^-r, (h - l) 5 n = {h ~ 1) • (40) 

ll a lll ll a lll 

In (l40l . gn = 1 due to the convention that the factor matrices A n , n > 2, have columns of unit 
norm. The result (1401 is in accord with Harshman's early results on uniqueness of rank-1 tensor 
decomposition ||8l. 

B. Rank 2 tensors 

Consider the scaling convention that all factor vectors except the first factor have unit norm. Let 

Cn> \cn\ < 1> be defined as 

' (aS n) ) T a5 n) for n = 2,...,N 

(41) 

(aWfa^/CllaW || llalll) for n = l . 
It follows from Theorem 5 that the CRIB on ai is a function of ci, . . . ,cjv multiplied by cr 2 /||ai|| 2 . 
It is symmetric function in C2, ... ,cjv and possibly nonsymmetric in c\. A closed form expression 
for the CRIB in the special case is subject of the following theorem. 
Theorem 6 It holds for rank 2 tensors 

(l-<? 1 )h 2 1 ly 2 + z-iqz(z + l)] 



CRIB(ai) = l^T^f 

where 



1 



{l- Cl y-hl{z + l)Y-hl{y + Cl zY 



(42) 



A? 



K = J] Cn for n = l,...,JV (43) 

2<fc^n 

n=2 n n 1 



^ 1 2 
* — > 1 — C 



(45) 



n=2 

Proof: See Appendix D. 

Note that the expressions (I44l -(|43T) contain, in their denominators, terms c n — h n c\. If any of these 
terms goes to zero, then quantities y and z go to infinity. In despite of this, the whole CRIB remain 
finite, because y and z appear both in the numerator and denominator in (l42l . 

For example, for order-3 tensors (N = 3) we get (using e.g., Symbolic Matlab or Mathematica) 

a 2 1 



CRIB Ar=3 (ai) 



|ai|| 2 I -h 2 



r 2 r 2 

h - 1 + 2 + 3 
1 — Co 1 — 6 



(46) 
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The above result coincides with the one derived in E71 . As far as the stability is concerned, the CRIB 
is finite unless either the second or third factor have co-linear columns. Note that the fact that the 
CRIB for ai does not depend on c\ can be linked to the uni-mode uniqueness conditions presented 
in 11221 . 

For N = 4, the similar result is hardly tractable. Unlike the case N = 3, the result depends on c\. 
A closer inspection of the result shows that the CRIB, as a function of c\, achieves its maximum at 
ci = 0, and minimum at c\ = ±1. Therefore we shall treat these two limit cases separately. 

We get 



CRIB 



7V=4,Ci: 



=o(ai) 



a 



1 



l a l| 



CRIB JV= 4, Cl =±i(ai) 



ll a i| 



1-hf 



1 



44 + C 2 C 4 + C 3 C 4 ^444 



„2„2 



2^2 



„2„2„2 



^444 



„2 2 
C 2 L 3 



,2„2 



CSC 



2 "-4 



44 + 1 



(47) 



ll a i| 



1 

T-fv( 



h 



h 



1 I L -2^ L "4 z,t '2 L '4 

1 + (l-^)(l-cj) 



r 2 -\-r 2 — r >r 2 r 2 



for (|c 2 | < l)&(|c 3 | < 1)&(M < 1) 
for I C4 1 = 1 



for 



(48) 



l c 3| 



for I02I = 1 . 



HaiH' l-tii L _i " ' (l-cl)(l-c^)_ 

As far as the stability is concerned, we can see that the CRIB is always finite unless two of the factor 
matrices have co-linear columns. 

Similarly, for a general N, we have for c\ = 



CRffi Cl=0 (ai) 



(7 



1 



|ai|| 2 I -hj 



1 + 



h\z 



1- h\ (z + 1) 



(49) 



C. A case with two factor matrices having orthogonal columns 

This subsection presents a closed-form CRIB for a tensor of a general order and rank, provided 
that two of its factor matrices have mutually orthogonal columns. The result cannot be derived from 
Theorem 5, because assumptions of the theorem are not fulfilled. 

Theorem 7 When the factor matrices Ai and A2 both have mutually orthogonal columns, it holds 



CRIB(ai) 



a i 



R 



1 + £r 



r=2 



l 2 r 



(50) 



where 7r = n^^fa^ far r = 2, .... A 
Proof: See Appendix E. 

Theorem 7 represents an important example when a tensor reshaping (see Section V.A. and B21 
for more details) enables very efficient (fast) CP decomposition without compromising accuracy. It 
has close connection with orthogonally constrained CPD P4l . ll35l . ||36*1 . 
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Tr(n,m) 



(52) 



IV. CRIB FOR TENSORS WITH MISSING OBSERVATIONS 

It happens in some applications, that tensors to be decomposed via CP have missing entries (some 
observations are simply missing). In this case, it is possible to treat stability of the decomposition 
through the CRIB as well. The only problem is that it is not possible to use expressions in Theorems 
3-8 in such cases. 

Assume that the tensor to be studied is given by its factor matrices Ai,...,Ajv and a 0-1 
"indicator" tensor W of the same dimension as y, which determines which tensor elements are 
available (observed). The task is to compute CRIB for columns of the factor matrices, like in the 
previous sections. The CRIB is computed through the Hessian matrix H as in (12) and (20), but its 
fast inversion is no longer possible. The Hessian itself can be computed as in its earlier definition 

H = Jl(9)J w (e), J w (0)= dveC %® W) (51) 

where is the parameter of the model ([2]). More specific expressions for the Hessian can be derived 
in a straightforward manner. 

Theorem 8: Consider the Hessian for tensor with missing data as an N x N partitioned matrix 

H = [H^ m )]^ >m=1 where H^ m ) = [Hft m) ]^ s=1 e Then 

'diag(Wx_ n {a^©a«,...,a^©ai 7V )}) , n = m, 

^ (a^ ai m > T ) © (W x _ {rt/m} {a« © a« , • • • , a<"» © a™ }) , n^m 
yx n u„ denotes the mode-n tensor-vector product between y and u n (4|, and 

yx_„{u} = y XiUi • • • x n _iu n _ix n+ iu n+ i • • • x n vl N . (53) 

Proof: See Appendix F. 

Theorem 8 can be used either to compute the CRIB for tensors with missing elements, or for 
implementing damped Gauss-Newton method for finding the decomposition in difficult cases, where 
ALS converges poorly. 

V. Application and Examples 
A. Tensor decomposition through reshape 

Assume that the tensor to-be decomposed is of dimension N > 4. The tensor can be reshaped 
to a lower dimensional tensor, which is computationally easier to decompose, so that the first factor 
matrix remains unchanged. The topic will be better elaborated in our next paper [32], in this paper 
we present only the main idea on two examples, to demonstrate usefulness of the CRIB. 

In the first example, consider N = 4. The tensor in £[]) can be reshaped to an order-3 tensor 

y res = f;aWoa( 2 )o(a( 4 )©a( 3 )) . (54) 

r=l 
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Both the original and the re-shaped tensors have the same number of elements (I1I2I3I4) ar >d the 
same noise added to them. 

The question is, what is the accuracy of the factor matrix of the reshaped tensor compared to the 
original one. The latter accuracy should be worse, because a decomposition of the reshaped tensor 
ignores structure of the third factor matrix. The question is, by how much worse. If the difference 
were negligible, then it is advised to decompose the simpler tensor (of lower dimension). 

If the tensor has rank one, accuracy of both decompositions is the same. It is obvious from (29). 

Let us examine tensors of rank 2. If the original tensor has correlations between columns of the 
factor matrices c\, C2, C3 and C4, the reshaped tensor has correlations c\, C2, and C3C4, respectively. 
CRIB(ai) of the reshaped tensor is independent of ci, while CRIB of the original tensor is dependent 
on c\, so there is a difference, in general. The difference will be smallest for c\ = (orthogonal 
factors) and largest for c\ close to ±1 (nearly or completely co-linear factors along the first dimension). 

The smallest difference between CRIB(ai) for the reshaped tensor and for the original one is 

C 2 \ 4 + 44 — l444 44 + C 2 C 4 + 44 — ^444 



IMP L (i - 4)(i - 44) (i - 444x2444 - 44 - 44 - 44 + i)J 

and the largest difference is 



a 2 



ai 
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4 + 44 %444 
(i-d)(i-44) 



a 2 



laill 2 



2^2 



csc: 



3^4 



1 c| 1 44 



We can see that the difference may be large if the second or third factor matrix of the reshaped 
tensor has nearly co-linear columns (4 ~ 1 or c 2 c| 1) . For example, for a tensor with I\ = 5, 
c\ = 0, C2 = 0.99, C3 = C4 = 0.1 the loss of accuracy in decomposing reshaped tensor in place 
of the original one is 11.22 dB. If c\ is changed to 1, the loss is only slightly higher, 11.23 dB. 
If c\ = C2 = 0, the loss is dB for any C3, C4 (compare Theorem 7). If c\ = 1, C2 = and 
c 3 = c 4 = 0.99, the loss is 8.5 dB. 

Another example is a tensor of an arbitrary order and rank considered in Theorem 7. Let this 
tensor be reshaped to the order-3 tensor of the size I\ x I 2 x (^3 . . . In). Comparing the CRIB(ai) 
of the original tensor and of the reshaped tensor shows that these two coincide. It follows that the 
decomposition based on reshaping is lossless in terms of accuracy. 

B. Amino Acids Tensor 

A data set consisting of five simple laboratory-made samples of fluorescence excitation-emission 
(5 samples x 201 emission wavelengths x 61 excitation wavelengths) is considered. Each sample 
contains different amounts of tryptophan, tyrosine, and phenylalanine dissolved in phosphate buffered 
water. The samples were measured by fluorescence on a spectrofluorometer PTI . Hence, a CP model 
with R = 3 is appropriate to the fluorescence data. 
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TABLE I 

Estimated CRIBs [dB] on best fit CP components of fluorescence tensor computed for assumed rank 

R = 1,2,3,4 



Factor R = l R = 2 R = 3 R = 4 



n 


1 


1 


2 


1 


2 


3 


1 


2 


3 


4 


1 


44.43 


44.44 


41.87 


64.76 


61.34 


64.98 


65.78 


60.96 


65.77 


38.17 


2 


27.44 


30.28 


27.71 


53.15 


50.17 


49.60 


54.33 


51.39 


50.87 


23.29 


3 


32.67 


36.23 


33.66 


58.96 


55.75 


54.87 


60.25 


56.28 


54.27 


25.74 



The tensor was factorized for several possible ranks R using the fLM algorithm B261 . CRIBs on 
the extracted components were then computed with the noise levels deduced from the error tensor 

= n / ' (55) 

lln J ™ 

The resultant CRIB's are computed for all columns of all factor matrices and are summarized in 
Table 1. 

Note that due to the "— 101og 10 " definition, high CRIB in dB means high accuracy, and vice versa. 
A CRIB of 50 dB means that the standard angular deviation (square root of mean square angular 
error) of the factor is cca 0.18°; a CRIB of 20 dB corresponds to the standard deviation 5.7°. 

The second mode to the decomposition, which represents intensity of the data versus the emission 
wavelength, for R = 2, 3, 4 and 8 is shown in Figure 1. We can see that the CRIB allows to distinguish 
between strong/significant modes of the decomposition and possibly artificial modes due to over-fitting 
the model. The criterion is different in general than the plain "energy" of the factor; if a factor has 
a low energy, it will probably have high CRIB, but it might not hold true vice versa. A high energy 
component might have a high CRIB. 

In the next experiment, we have studied how much the accuracy of the decomposition is affected 
in case that some data are missing (not available). The decomposition with the correct rank R = 3 
and a 2 estimated as in (1551 ) was taken as a ground truth; the 0-1 indicator tensor W of the same size 
was randomly generated with a given percentage of missing values. The CRIB of the second mode 
factors was plotted in Figure 2 as a function of this missing value rate. The figure also contains mean 
square angular error of the components obtained in simulations. Here an artificial Gaussian noise with 
zero mean and variance a 2 was added to the "ground truth" tensor. The decomposition was obtained 
by a Levenberg-Marquardt algorithm |[26l modified for tensors with missing entries. 

A few observations can be made here. 

• CRIB coincides with MSAE for the percentage of the missing entries smaller than 70%. If the 
percentage exceeds the threshold, CRIB becomes overly optimistic. 
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(c) Estimated components as R = 4. (d) Estimated components as R = 8. 

Fig. 1. Illustration for emission components from best-fit decompositions over 100 Monte Carlo runs for example VI-A. 
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Fig. 2. CRIB for the second-mode components of CP decomposition of tensor in section VIA with missing elements and 
mean square angular error obtained in simulations versus percentage of the missing elements. 



• In general, accuracy of the decomposition declines slowly with the number of missing entries. 
If the number of missing entries is about 20%, loss of accuracy of the decomposition is only 
about 1-2 dB. 
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C. Stability of the decomposition of Brie 's tensor 

Brie et al |[20l presented an example of a four-way tensor of rank 3, which arises while studying the 
response of bacterial bio-sensors to different environmental agents. The tensor has co-linear columns 
in three of four modes and the main message of the paper is that its CP decomposition is still unique. 
In this subsection we verify stability of the decomposition. 

The factor matrices of the tensor have the form 



[ai,a 2 ,a 3 J 



[a 4 ,a 4 ,a 5 ], A a 



Assume for simplicity that all factors have unit norm, ||a 
holds that CRIB on ai is a function of scalars c\\ 



a 6 , a 7 , a 6 ] A 4 = [a 8 , a 9 , a 9 ] . 

t || = 1, n = 1, . . . , 9. Due to Theorem 5 it 



a 1 r a 2 , C12 = afa 3 , ci 3 = a^a 3 , c 2 = aja 5 , 
c 3 = a^ay, c 4 = agag and I\, which is the dimension of a 4 . Then, the matrices C n = A^A n , 
n = 2, 3, 4, have the form 

1 1 C 2 1 C 3 1 1 C 4 C 4 

1 1 C 2 , C 3 = C 3 1 C 3 , C4 = C4 1 1 
C 2 C 2 1 1 C 3 1 C 4 1 1 

A straightforward usage of Theorem 4 is not possible, because some of the involved matrices become 
singular. The CRIB itself, however, is finite and can be computed using an artificial parameter e as 
a limit. The limit CRIB is computed for modified matrices at e — > 0, 



'2 = 



1 1 — £ C 2 

1 — e 1 c 2 
c 2 c 2 1 



'3e 



,c 



lt- 



1 C4 
C4 1 
c 4 1 — e 



C4 

1 -e 
1 



1 c 3 1 - e 
c 3 1 c 3 
1 - e c 3 1 

If any of the correlations c 2 , c 3 , C4 is zero, it is also augmented by e. 

The limit CRIB can be shown to be independent of off-diagonal elements of Ci, unless Ci is 
singular. Assume that Ci is regular. The result, obtained by Symbolic Matlab, is 

' 2 1 



CRIB e= o(a!) 



a 



1 1| ^C^C^C^ ^2^3 



2^2 



2°4 
4^2 



c|c? + 1 



C 4( c 2 + l)-3c| + l c 4( c 2 + i)_ 3c 2 + l 2-ci-q 



1 



+ 



1 



(56) 



It follows that the decomposition is stable, unless all three factors in some mode are collinear. 



D. Maximum Stable Rank 

A theoretically interesting question is, what is the maximum rank of a tensor of a given dimension 
which has a stable CP decomposition (with finite CRIB). For easy reference, we shall call it maximum 
stable rank and denote it R sm ax(Ii, ■ ■ ■ , In)- 
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An upper bound for the maximum stable rank can be deduced from the requirement that the number 
of free parameters in the model, which is R(Yln=i In ~ N + 1) in CP decomposition, cannot exceed 
dimension of the available data, which is IT^Li In- It follows that 



Rsmax(h, ■ ■ • j In) < 



lln=l 1 n 



En=lIn-N+\ 



(57) 



where [x I denotes the lower integer part of x. It can be verified numerically that for many (and 
maybe al\j) tensor dimensions, an equality in (I57T ) holds. In other words, it means that the CRIB 
computed, e.g., via Theorem 4 for a CP decomposition with rank R = R sma x and some (e.g. random) 
factor matrices is finite. For example, the maximum stable rank is R srnax = 2 for 2 x 2 x 2 tensors, 
and R S rnax = 3 for 3 x 3 x 3 tensors. For order-8 tensors of dimension 2 x . . . x 2, (8x), it holds 

Rsmax — 28. 

It might be interesting to compare the maximum stable rank with the maximum rank and the 
maximum typical rank (to be explained below) for given tensor dimension, if they are known iPBl . 
If the elements of a tensor are chosen randomly according to a continuous probability distribution, 
there is not a rank which occurs with probability 1 in general. Such rank, if exists, is called generic. 
Ranks which occur with strictly positive probabilities are called typical ranks. For example it was 
computed in |[T0l that probability for a real random Gaussian tensor of the size 2 x 2 x 2 to be 2 
and 3 is 7r/4, and 1 — 7r/4, respectively. We can see that no tensor of the rank 3 and the dimension 
has a stable decomposition. For tensors of the dimension 3x3x3 the typical rank is 5 ifTOl . it is a 
generic rank - but no decomposition of these rank-5 tensors is stable, as R sm ax = 3. 

Next, it might be interesting to compare the maximum stable rank with the maximum rank for 
unique tensor decomposition, or prove that these two coincide. Liu and Sidiropoulos ifTTll . |[29l derived 
a necessary condition for uniqueness of the CP decomposition, which, according to a formulation in 
IH reads 

min rank (Ai . . . A n _i A n+ i ... Ajv) = R (58) 

n=l,...,N 



where means the Khatri-Rao product. The condition (1581 ) is equivalent to the condition that the 
matrices H n = Ai ... A n _i A n+ i ... Ajv have all full column rank, n = 1, . . . , N, which 
is further equivalent to the condition that the product H^3 n are regular for n = 1, . . . , N. Finally 
note that 

H T H — r r? — 1 N 

where Y nn was defined in (f5]) and appears in computation of the CRIB. 



1 We do not have yet a formal proof that the equality in J57t holds for all tensor dimensions and orders. 
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Unfortunately, it appears that the condition (1581 ) is only necessary, but not sufficient for uniqueness. 
It is often fulfilled for R higher than R sm ax- Thus a relation between the stability and uniqueness of 
the CP decomposition remains open question for now. 

VI. Conclusions 

Cramer-Rao bounds for CP tensor decomposition represent an important tool for studying accuracy 
and stability of the decomposition. The bounds derived in this manuscript serve as a theoretical support 
for a method of the decomposition through tensor reshaping |[32l . As a side result, a novel method 
of inverting Hessian matrix, which is more computationally efficient, is derived for the problem. It 
enables a further improvement of speed of the fast Gauss-Newton for the problem ESI . A novel 
expression for Hessian for CP decomposition of tensor with missing entries has been derived. It can 
serve for assessing accuracy of CP decomposition of these tensors without need of long Monte Carlo 
simulations, and for implementing a damped Gauss-Newton algorithm for CP decomposition of these 
tensors. 

A direct link between stability and essential uniqueness remains to be an open theoretical question. 
In particular, it is not known yet for sure if stability implies the essential uniqueness. 

CRB expressions similar to the ones derived in this paper can be also derived for other important 
special tensor decomposition models such as INDSCAL (where two or more factor matrices coincide) 
ED, El, or for the PARALIND model, where the factor matrices have certain structure l22l . and 
for block factorization methods. 

Appendix A 
Matrix Inversion Lemma (Woodbury identity) 

Let A, X, Y, and R are matrices of compatible dimensions such that the following products and 
inverses exist. Then 

(A + XRY)- 1 = A" 1 - A" 1 X(R" 1 + YAT^X^YA -1 . (59) 

Appendix B 

Proof of Theorem 4 

Let the matrices K and \I/ in (fT8l) be partitioned as 






Ki 

















K 2 







*2 



where the left-upper blocks have the size R 2 x R 2 . Then, using a formula for inverse of partitioned 
matrices, the left-upper block of B in (fT8l) can be written as 

B = Ki(I (JV _ 1)JP + * 2 K 2 - * 2 Kf *!Ki) ^ 2 Kf = KiKg 1 * 2 K^ . (61) 
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A key observation which enables a fast inversion of the term K3 is that 



K 



K + DFD 



where 



K 
F 
D 



-bdiag(P i? F(dvec(l0C n )) 



2\N 
>n=l 



N 



P R H dvec(C n ) = P R dvec(r n © 



n=l 



[dvec(l Ci), . . . ,dvec(l C N )Y 



Similarly, 



where 



K 5 



K 02 + D 2 FD^ 



(62) 

(63) 
(64) 
(65) 

(66) 

(67) 
(68) 



K 02 = -bdiag(P R F(dvec(l0C n )) 2 , ? 
D 2 = [dvec(l0C 2 ),...,dvec(l0Cjv)] T 
Then the matrix K3 in (I6TT) can be written as 

K 3 = I (JV -l)i? 2 + *2K 2 - * 2 Kf *iKi 

= I (N _ 1)R 2 + * 2 (K 02 - Kj *iKj) + * 2 D 2 FD 2 n 

= Q + * 2 D 2 FD^ (69) 

where 

Q = bdiag(Q„)^ =2 (70) 
Q n = I R 2 - (r^ X n )P R (F(dvec(l C n )) 2 + dvec(r ln )(r^ C 1 )dvec(r ln )P Ji )(71) 
Now, K3 can be easily inverted using the matrix inversion lemma (l59l . 

K3 1 = Q 1 -Q 1 D 2 r (I, ?2 +D 2 r Q- 1 * 2 D 2 F)- 1 * 2 D 2 FQ- 1 . (72) 
Inserting (1721) in (l6Tb gives, after some simplifications, the result (I33T ). ■ 

Appendix C 

Proof of Theorem 5 

Consider the change of scale of columns of factor matrices up to their first columns. As in Section II 
assume that the scale change is realized in Ai, while the other factor matrices have columns of unit 
norm. The theorem claims that the substitution Ai <f— AiD into (|27T ) where D = diag(l, A 2 , . . . , Xr), 
X r ^ 0, has no influence on CRIB(ai). 
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The substitution Ai <- A X D leads to Ci <- DCiD and Xi <- DX X D while C n and X n , 
n = 2, . . . , N, remain the same. Consequently, Fi n , n = 1, . . . , N, remain unchanged while Y nn 
Dr„„D for n = 2, . 



N. Now, we can substitute into (1351) assuming that the condition of Theorem 4 
is satisfied. 

Let S n denote the matrix S n in ( [39b after the substitution Ai <— AiD. It can be shown that 
(D® (D I R ) using the rules 

(Dr^D)- 1 ® X n , = (D" 1 ® I R )(T-^ X n )(D- x ® (73) 

dvec(Dr nn D C n ) = (D D)dvec(r n „ C n ) (74) 

(la D)P fi = P i? (D (75) 

and the fact that diagonal matrices commute. Using the same rules in further substitutions, after some 
computations, the independence of CRIB(ai) on D follows. 

Appendix D 

Proof of Theorem 6 

Again, assume for simplicity that all factors have unit norms. It holds 







i 


hi 




in = 




; X n — 






1 





and 



5n 

gl,: 



1 



5ii [1, 



n = 1 , . 



N . 



1 



\-h\ 
-hi] . 



The matrix \I/ in (1321) can be decomposed as \I/ = J<& where 
J = bdiag(l 4 ,I 2 [0, 1] T ,...,I 2 0[O, 1] T ) 

* = bdiag^^Ci^l-cDr^ 1 ®^, 1],...,(1- c 2 N )T- l N 0[O, 1]) . 
Then the matrix B in (TT8T ) can be rewritten using the Woodbury identity (l59l ) as 

B = K(I 4 tv + J*K) 1 = K - KJ(I 2 tv+2 + *KJ) . 
Now, put B4 = I2N+2 + ^KJ and write it in the block form as 

B41 B42 
B43 B44 

where B41 has the size 4x4. The bottom-right block B44 of dimension (2N — 2) x (2./V 
easy to be inverted using the Woodbury identity again, because it can be written as 



B, 



I27V+2 + *KJ 



(76) 
(77) 

(78) 
(79) 

(80) 

(81) 
2) is 



B 



11 



B 5 + sf 1 



(82) 
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where 



B 



B 5 = bdiag(B 52 , . . . ,B 5A r) 
1 




5/i 



h n d{l-cl) 
l-hlci 



cl-hjcj 
l-hlci 



n 



N 



h 2Cl (i-4) (i-4) 



\-h\c\ ' 1 - ^cf ' ' 
f = [0, 1, 0, 1,...,1] T . 

After some computations, we receive the result 



2 \lT 



h N q{l - c 2 N ) (1 - c%) 
1 — h>N c l 1 ~~ ^N c l 



(83) 
(84) 

(85) 
(86) 



Appendix E 



Proof of Theorem 7 



Under the assumption of the Theorem, it holds that the matrix Ci = A^Ai is diagonal and C 2 = 
Ir (identity matrix). Thanks to Theorem 5 we can assume, without any loss of generality, that Ci = Ir 
as well. It can be shown for Y mn in ((5]) that T mn = Ir for all pairs (m,n), (m,n) ^ (1,2), (2, 1). 
Only Ti2 and T21 = T12 are possibly different. Note that the first row of Ti2 is (1,72, • • • ,7tv)- 

It follows from these observations that all non-diagonal R 2 x R 2 blocks K mn of K in Q with 
(m, n) ^ (1, 2), (2, 1) are identical, diagonal, having 1 at positions (p,p), p = 1, R+l, 2R+2, . . . , R 2 
and elsewhere. In other words, these K mn can be written as K mn = QQ T , where Q is a 0-1 matrix 
of the size R 2 x R, the p— th column of Q has the value 1 at position (p — 1)(R + 1) + 1 and 
elsewhere. 

Computation of the CRIB can proceed from equation (l6Tb by inserting the special form of the 
blocks of Ki and K2 and using the Woodbury identity 



Appendix F 



Proof of Theorem 8 



The following identities are used in this proof 

vec(A©B) = dvec(B) vec(A) , 
a T diag(b) c = (a © c) T b, 
(a (g) b) © (c <g) d) = (a © c) © (b © d) . 
Here, dimensions of a, b, c and d are assumed to match accordingly. 



(87) 
(88) 
(89) 
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The approximate Hessian in (|5TT) is given by 

H = 3^(0)3 W (0) = J(0) T dvec(W) 3(0), (90) 



where 3(0) is the Jacobian for the complete data. 
We have 

- ( <g) ^W^fga^l , (91) 

\fc=n+l / \fc=l 



<9aL 



(n) 



where unit vector e|"^ for i = 1, 2, . . . ,/ n is the i-th column of the identity matrix of size /„ x I n . 
An (i, j) entry of a sub matrix H^" for i = 1, 2, . . . , I n , and j = 1, 2, . . . , I n is given by 



= ^c(W) 



T 



' AT \ /n-1 

4 fc ) © a( fe ) © (ej n) © ef ) © (g) a« © a« ) ) vec(W) 

vfc=n+l / \k=l ) ) 

» 



= W x„ n {b( fc )} x n 5 lje f\ (92) 

where Sij is the Kronecker delta, = a£ © a^ , for n = 1, . . . , N. This leads to that a diagonal 
sub-matrix Hr™/ 1 ^ is a diagonal matrix as in Theorem |W] 

For off-diagonal sub matrices H^'™' of size I n x I m (1 < n < m < N), we have 

T 



'TV \ / ro-1 \ 

4 h) © a( fc ) © (a(, m ) © ef >) © (g) a« © a« © 

vfc=m+l / \fc=n+l / 
/n-1 \ \ T 



a )r ' «£' I Wx. M {b^} x n e 4 w x m ef 1 . (93) 



ej n) © a£ n >) © f (g) a« © a«j 1 vec(W) 
(m) (n) / ln - ri_CfclT ~ ( n ) ~ ( -m ) 

This leads to the compact form in Theorem 8. ■ 
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